Infection dynamics of COVID-19 virus under lockdown and reopening

Motivated by COVID-19, we develop and analyze a simple stochastic model for the spread of disease in a human population. We track how the number of infected and critically ill people develops over time in order to estimate the demand imposed on the hospital system. To keep this demand under control, we consider a class of simple policies for slowing down and reopening society, and we compare their efficiency in mitigating the spread of the virus from several points of view. We find that, in order to avoid overwhelming the hospital system, a policy must impose a harsh lockdown or react swiftly (or both). While reacting swiftly is universally beneficial, being harsh pays off only when the country is patient about reopening and when neighboring countries coordinate their mitigation efforts. Our work highlights the importance of acting decisively when closing down and the importance of patience and coordination between neighboring countries when reopening.

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the virus that causes the current pandemic of coronavirus disease 2019 (COVID-19). The infection was first identified in December 2019 in Wuhan (China) and has since spread globally. By November 2021, more than 250 million people had tested positive for the virus and more than 5 million people had died from complications caused by it. The large majority of cases result in recovery after mild or no symptoms. The coronavirus pandemic has led to an unprecedented global response in the form of quarantine measures, social distancing, travel restrictions and the shutting down of economic activity.
We consider the infection dynamics of the coronavirus in a population of size N. The population represents a community (a city, a state, or a country). Initially, all people are uninfected. Then, we add one (or several) infected individuals and follow the stochastic trajectories of viral spread. The process advances in discrete time steps that represent days. Each individual is in one of five states: susceptible (S), exposed (E), mildly ill and infectious (I), critically ill (C), or recovered/removed (R); see Fig. 1a. We assume that critical cases are hospitalized. The infection can spread whenever a susceptible person comes in contact with an infectious person. In this case, the infection is transmitted with probability p (see Fig. 1b). We denote the number of daily contacts per person by k_0.
We assume that the community has a capacity c of hospital beds to treat the critical cases. When left unregulated, the disease would surge through the community and exceed that capacity c (see Fig. 1c). A country can mitigate the spread of the disease by introducing various non-pharmaceutical interventions, such as enforcing social distancing or shutting down non-essential businesses. We model such interventions by decreasing the number k_0 of daily contacts of an individual to a value k < k_0. We call the regime in which the interventions are in place a lockdown.
Here we study the key question of how quickly and how severely a community should lock down, and how patient it should be before reopening again. To that end, we consider a simple class of policies characterized by three parameters τ, k, d. The parameter τ is the number of critical cases that triggers the community to enter the lockdown. It models the cautiousness or agility of the policy. The parameter k is the number of daily contacts per person in a lockdown. It models the severity of the policy. Finally, the parameter d is the number of days the community needs to spend with critical cases below the trigger threshold τ before the lockdown is lifted. It models the patience of the policy. In other words, a policy P(τ, k, d) locks down to k < k_0 daily contacts per person once the number of critically ill individuals exceeds a given trigger threshold τ, and it reopens to k_0 daily contacts once the number of critically ill individuals has remained under that threshold τ for d consecutive days. We evaluate the performance of such policies with respect to several measures. For instance, we consider the peak load C_max, which is the expected number of critical cases at its maximum, and the overflow probability p_fail, which is the probability that the peak number of critical cases exceeds the hospital bed capacity c available to treat the critically ill.
We find that the best performing policies are those that quickly transition to a severe lockdown and that are patient about reopening. However, when either quick or severe action is not feasible, as is often the case, a country can compensate by pressing more along the other dimension. This gives rise to a spectrum of possible policies for closing down and reopening society. At one end of this spectrum, a moderate low-trigger policy (P_ML) imposes a gentle lockdown at the first sign of the onset of the disease. At the other end, a severe high-trigger policy (P_SH) remains open until the last possible moment and then imposes a harsh lockdown. We find that, though comparable in some regards, those two policies are very different in terms of their long-term behavior and in terms of their sensitivity to policies employed by neighboring countries. Specifically, we argue that the moderate low-trigger policies are preferable assuming that the countries are able to coordinate and that an efficient vaccine is not distributed soon, whereas the severe high-trigger policies are preferable otherwise.

Model
In order to describe the spread of COVID-19, we consider a stochastic, discrete-time, individual-based SIR-like model.

Disease progression within an individual.
We consider the following simple model. Initially, a typical individual is susceptible (S) and can contract the disease after contact with an infectious individual (see Fig. 1a). Immediately upon contracting the disease, an individual becomes exposed (E) and does not yet spread it. Later they become infectious (I) as they develop a mild condition, and then they either recover (R) or become critically ill (C). Critical individuals are hospitalized and isolated, and they occupy part of the capacity c of the health system. Eventually, they either recover or die (R). We assume that recovered individuals acquire immunity. Upon contracting the disease, each transition occurs after a number of days that is given by a corresponding random variable X_{E→I}, X_{I→R}, X_{I→C}, X_{C→R}. For concreteness, we set the values based on the data on COVID-19 [17-19, 21, 38-40]: the pre-infectious period is X_{E→I} = 2 days and the individual recovers from a mild condition after X_{I→R} = 10 days. During each of those 10 days, an individual might become critically ill with probability 1% (hence X_{I→C} is geometrically distributed with parameter 1%, the discrete-time analogue of an exponential distribution, and roughly 10% of cases become critical). The critical cases [...]
Figure 1. The disease spread without an intervention. (a) Two days after being exposed to the disease (E), the individual becomes infectious (I) as they develop a mild condition. If a critical condition develops (C), the individual is hospitalized and isolated. We assume that all surviving individuals (R) acquire immunity. (b) A population of N individuals. Each day, an individual meets k other individuals. During a single meeting with an infectious person, a susceptible individual contracts the disease with a transmission probability p. (c) Without intervention, the disease surges through the community and the critical cases (curve) at their peak C_max exceed the available hospital bed capacity c (dashed lines [41,42]).

Disease spread through the population.
We consider a population of N individuals. Each day, each individual comes in contact with k_0 other individuals (see Fig. 1b). When a susceptible individual (S) meets an infectious individual (I), he or she contracts the disease with transmission probability p and becomes exposed (E). Note that since we assume that recovered individuals acquire immunity, the disease eventually gets eradicated; at the latest, this happens once everyone is exposed and later recovers. We denote the number of critically ill individuals at day t by C(t) and use analogous notation for other conditions. For concreteness [41-43], we consider a population of N = 20,000 individuals, a health system capacity of 2.8 beds per 1000 individuals leading to c = 56 beds in total, k_0 = 15 daily interactions per person, and a transmission probability p = 2%. All in all, this gives a basic reproductive ratio of roughly R_0 = k_0 · p · X_{I→R} ≈ 2.9.
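The model described above can be sketched as a short simulation. The code below is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `simulate`, the contact-sampling scheme (each infectious individual draws its contacts uniformly at random from the whole population, with a fractional k handled by randomized rounding), the number of initial infections `seeds`, and the length `days_critical` of the critical stage (its value is not given in the text here) are all hypothetical choices.

```python
import random

def simulate(n=20_000, k=15.0, p=0.02, days_exposed=2, days_infectious=10,
             p_critical=0.01, days_critical=14, seeds=5, max_days=1000, rng=None):
    """Return (critical cases per day, final state of every individual).

    days_critical and seeds are placeholder assumptions, not source values.
    """
    rng = rng or random.Random()
    state = ["S"] * n          # one of "S", "E", "I", "C", "R"
    timer = [0] * n            # days left in the current state
    active = set(rng.sample(range(n), seeds))   # individuals in E, I or C
    for i in active:
        state[i], timer[i] = "I", days_infectious
    critical_per_day = []
    for _ in range(max_days):
        if not active:         # nobody is exposed, infectious or critical
            break
        # 1. Transmission: every infectious individual meets about k others.
        newly_exposed = set()
        for i in [a for a in active if state[a] == "I"]:
            contacts = int(k) + (rng.random() < k - int(k))
            for _ in range(contacts):
                j = rng.randrange(n)
                if state[j] == "S" and rng.random() < p:
                    newly_exposed.add(j)
        # 2. Disease progression of everyone who is already E, I or C.
        for i in list(active):
            if state[i] == "I" and rng.random() < p_critical:
                state[i], timer[i] = "C", days_critical   # hospitalized
                continue
            timer[i] -= 1
            if timer[i] == 0:
                if state[i] == "E":
                    state[i], timer[i] = "I", days_infectious
                else:                                     # I or C -> R
                    state[i] = "R"
                    active.discard(i)
        # 3. Today's new infections start their pre-infectious period.
        for j in newly_exposed:
            state[j], timer[j] = "E", days_exposed
            active.add(j)
        critical_per_day.append(sum(1 for a in active if state[a] == "C"))
    return critical_per_day, state
```

Calling `simulate(rng=random.Random(1))` gives one stochastic trajectory C(t) without any intervention; passing a smaller `k` imitates a permanent lockdown.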
Policies. We consider a 3-parameter class of policies that a country can use to mitigate the spread of the disease. The policies toggle between two regimes: the default open regime and the temporary lockdown regime (see Fig. 2a). A policy P(τ, k, d) is described by three parameters τ, k, d that determine how soon and how severely the policy locks down, and how soon it reopens:
1. Once the number C(t) of critically ill cases exceeds the trigger threshold τ, the policy locks down by reducing the number of daily contacts per person from k_0 to k.
2. Once the number C(t) of critically ill cases has remained under the trigger threshold τ for d consecutive days, the policy reopens by resetting the number of daily contacts per person back to k_0.
The three parameters τ, k, d thus model three natural features of the policy: its "cautiousness", that is, how easily it is triggered into a lockdown; the "severity" of its lockdowns; and its "patience" when reopening, respectively. We remark that we chose the number of critically ill people C(t) rather than the number of infectious people I(t) since the latter is not so easily accessible to policymakers.
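The two toggling rules can be written as a small state machine. The following sketch is an illustration only; the class name `LockdownPolicy`, its field names, and the handling of the boundary case C(t) = τ (treated here as "not under the threshold", so it resets the patience counter) are assumptions that the text does not settle.

```python
from dataclasses import dataclass

@dataclass
class LockdownPolicy:
    tau: int                 # trigger threshold on critical cases
    k_lock: float            # daily contacts during a lockdown (severity)
    d: int                   # consecutive days below tau before reopening
    k_open: float = 15.0     # daily contacts in the open regime (k_0)
    in_lockdown: bool = False
    days_below: int = 0

    def contacts(self, critical: int) -> float:
        """Update the regime from today's C(t) and return today's k."""
        if not self.in_lockdown:
            if critical > self.tau:          # rule 1: lock down
                self.in_lockdown = True
                self.days_below = 0
        else:
            # Count consecutive days strictly below the threshold.
            self.days_below = self.days_below + 1 if critical < self.tau else 0
            if self.days_below >= self.d:    # rule 2: reopen
                self.in_lockdown = False
        return self.k_lock if self.in_lockdown else self.k_open
```

Feeding the daily number of critical cases into `contacts` yields the number of daily contacts to use on that day, so the object can drive any day-by-day simulation of the disease.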
Figure 2. (a) The policy toggles between an open regime and a lockdown regime: the country locks down to k < k_0 daily contacts whenever the number C(t) of critical cases exceeds a trigger threshold τ. It reopens (to k_0 daily contacts) once the number of critical cases stays below τ for d consecutive days. (b) We consider four different policies given by a combination of a trigger threshold (low trigger τ_low = 3, high trigger τ_high = 12) and a lockdown severity (severe k_low = 1.25, moderate k_high = 6), and common patience d = 10 days. (c) Representative runs under the four policies (for 800 days). While with the severe high-trigger policy P_SH (top left) all peaks are similar in shape, with the moderate low-trigger policy P_ML (bottom right) all subsequent peaks are much smaller than the first one. With the moderate high-trigger policy P_MH (top right) the capacity is exceeded, and with the severe low-trigger policy P_SL (bottom left) the disease is quickly eradicated.
Performance of the policy. In order to evaluate the performance of a policy, we study the following quantities. Denote by M = max{C(t) | t ≥ 0} the random variable that corresponds to the number of critically ill people at their maximum, over the duration of the disease.
1. The (expected) peak load C_max: that is, the expected number C_max = E[M] of critically ill people at their maximum, over the duration of the disease. This represents the maximum demand on the health system (hospital beds).
2. The overflow probability p_fail: that is, the probability p_fail = Pr[M > c] of exceeding the available bed capacity c at some point throughout the course of the disease.
3. The total load C_all: that is, the total cumulative number of critical cases, over the duration of the disease. This can be used to estimate the total number of deaths.
4. The total (expected) duration D of the lockdowns: that is, the total expected number of days spent in the lockdown regime, until the disease is eventually eradicated within the community.
In the Supplementary Information, we also consider another measure, namely the total (expected) overflow of the bed capacity E[Σ_{t≥0} max{C(t) − c, 0}], see Supplementary Fig. 1. In all cases, the lower the quantity, the better the policy. Hence, we can think of all the quantities as costs of the policy. We note that the first three quantities can be viewed as costs related to the health of the population. For a fixed lockdown severity k, the fourth quantity can be viewed as an economic cost of imposing a lockdown of that severity, since a lockdown reduces the economic activity of a community.
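Given a collection of simulated runs, these quantities can be estimated by simple averaging. The sketch below assumes a hypothetical data layout in which each run is a triple of equally long per-day lists (critical cases C(t), newly critical cases, and a flag saying whether the day was spent in lockdown); the function name `evaluate` and the dictionary keys are likewise illustrative.

```python
from statistics import mean

def evaluate(runs, capacity):
    """Monte Carlo estimates of the policy costs.

    runs: list of (critical_per_day, new_critical_per_day, locked_per_day).
    """
    peaks = [max(critical) for critical, _, _ in runs]       # samples of M
    return {
        "C_max": mean(peaks),                                 # E[M]
        "p_fail": mean(1.0 if m > capacity else 0.0 for m in peaks),
        "C_all": mean(sum(new) for _, new, _ in runs),        # cumulative cases
        "D": mean(sum(locked) for _, _, locked in runs),      # lockdown days
        "overflow": mean(sum(max(c - capacity, 0) for c in critical)
                         for critical, _, _ in runs),         # bed-days short
    }
```

For instance, two toy runs with peaks 5 and 2 and a capacity of 4 beds give C_max = 3.5 and p_fail = 0.5.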

Results
We evaluate the performance of the above defined policies. First, we do this for a single country. Later, we do this for two neighboring countries. Recall that a policy P(τ , k, d) is given by three parameters: the threshold number of critical cases τ that triggers the policy to toggle to lockdown and back ("trigger value"); the number k of daily contacts per person during a lockdown ("severity"); and the number d of days required to remain below the trigger threshold τ before the society can reopen ("patience"), see Fig. 2a.

Four example policies.
To illustrate the differences in performance of various policies, we first consider four specific policies that all share the patience parameter d = 10 days and that differ in the trigger value τ and in the severity k (see Fig. 2b). Specifically, in terms of the trigger value τ, we distinguish low-trigger policies (τ_low = 3) from high-trigger policies (τ_high = 12). Similarly, in terms of the number k of daily contacts in a lockdown, we distinguish severe policies (k_low = 1.25) from moderate policies (k_high = 6). All in all, this yields 2 × 2 = 4 combinations: P_SL (severe, low-trigger), P_SH (severe, high-trigger), P_ML (moderate, low-trigger), and P_MH (moderate, high-trigger). We observe that the policies substantially alter how the number of infected and critical cases evolves in time (see Fig. 2c). To explain the difference, it is instructive to think in terms of the effective reproductive number R_e that determines whether the number of infected individuals in a population is quickly surging (R_e > 1), disappearing (R_e < 1) or changing slowly (R_e ≈ 1). Note that R_e is not constant in time: it crucially depends on the current number k of daily contacts (R_e decreases as k decreases) and also on the percentage x of immune individuals (R_e decreases as x increases). In the open society (k large) and with no immune individuals (x = 0) we have R_0 > 1, and hence the disease initially spreads quickly.
Severe policies. Under the two severe policies, the number of cases in time follows the familiar spikes: each lockdown is so harsh that as long as it is in place, we have R_e < 1 even when x = 0. Therefore, a few days after the lockdown is imposed (the pre-infectious period), the infected cases rapidly drop, then the critical cases drop too, and the disease can possibly get eradicated in some communities. This happens over a short period of time and only a few people acquire immunity (x ≈ 0). When the trigger τ = τ_high is high, the patience of d = 10 days is insufficient to eradicate the disease completely, the lockdown is lifted too early, a subsequent spike of similar shape is likely, and the whole cycle repeats several times. When the trigger τ = τ_low is low, waiting for d = 10 days will typically suffice to eradicate the disease completely within the community and no subsequent spikes occur.
Moderate policies. Under the two moderate policies, the typical stochastic trajectories are different. The lockdown is so gentle that when x = 0, we have R_e ≈ 1. Hence, upon imposing a moderate lockdown, the number of ill individuals becomes roughly constant in time. But as time goes by and the individuals progress through the disease stages and acquire immunity (x > 0), the value of R_e decreases and the disease starts to die out (R_e < 1). Therefore, compared to the severe policies, the first peak is substantially broader. (However, the taller it is, the less apparent this distinction is.) Crucially, if the lockdown is lifted too soon and another outbreak occurs later, once the same moderate lockdown is imposed again, the immune subpopulation (x > 0) causes the disease to die out right away; in fact, it dies out faster and faster in every subsequent lockdown. Therefore, while the first outbreak might require a long lockdown phase, all subsequent outbreaks are dealt with promptly. In a sense, when the moderate lockdown is lifted for the first time, the population has already acquired the herd immunity level for the interaction rate k of the moderate lockdown. This means that when the lockdown is in place, the disease does not spread. However, once the lockdown is lifted and individuals start to interact more frequently, the disease could start spreading again. For our parameters, when the trigger τ = τ_high is high, the lockdown is not strong enough and the bed capacity is (slightly) exceeded within the first peak. When the trigger τ = τ_low is low, the critical cases stay safely below the bed capacity. In both cases, the patience d = 10 days is insufficient to completely eradicate the disease and subsequent peaks occur, all substantially smaller than the first one.

How parameters affect the performance.
In order to understand the role of the three parameters (trigger value τ, severity k, patience d) on each of the four performance measures (peak size C_max, overflow probability p_fail, total number of critical cases C_all, lockdown duration D), we run exhaustive computer experiments. In Fig. 3, each row shows a different performance measure (y-axis) as a function of the number k of daily contacts in a lockdown (x-axis). The red dotted vertical line marks the number k⋆ of daily contacts that corresponds to R_e = 1 (when there are no immune individuals, x = 0). Within each row, the left panel shows policies that have low patience (d = 7 days) and the right panel shows policies that have high patience (d = 70 days). Within each panel, the blue curve shows the low-trigger policies (τ_low = 3) and the green curve shows the high-trigger policies (τ_high = 12).
Recall that for each performance measure, the lower the value, the better the performance. The effect of the trigger value can be seen by comparing the blue and the green curve: since the blue curve is typically lower, low-trigger policies are generally better than high-trigger policies. The effect of the lockdown severity can be seen by observing the performance curves as k decreases: since the curves decrease at lower k values, severe policies are generally better than moderate policies. The effect of the patience can be seen by comparing the left and the right panel: since the curves in the right panel are typically lower, patient policies are generally better than impatient policies.
There are two exceptions to those rules, both concerning the lockdown duration D. First, we observe that when k becomes too large, the duration D decreases. This is due to the fact that the lockdown becomes too weak and the disease quickly sweeps through the whole population. Second, we observe that when the patience is low and the lockdown is moderate, decreasing the trigger value τ actually leads to more time spent in the lockdown.
Key parameters for different regimes. Next, for each performance measure, we characterize which of the three parameters τ, k, d are key to substantially improving the performance and which of them are marginal.
Let k⋆ be the number of daily contacts that corresponds to R_e = 1 when there are no immune individuals (x = 0). For our parameters, we have k⋆ ≈ 5.3.
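The dependence of R_e on k and x, and the resulting threshold k⋆, can be approximated with a few lines of arithmetic. The sketch below is a back-of-the-envelope calculation under assumptions that are not spelled out in the text: a well-mixed population in which R_e scales with the susceptible fraction (1 − x), and an infectious period shortened by the 1% daily chance of hospitalization. All function names are illustrative. Under these conventions the numbers come out close to, but not exactly equal to, the quoted R_0 ≈ 2.9 and k⋆ ≈ 5.3.

```python
def effective_infectious_days(days=10, p_critical=0.01):
    # Expected number of infectious days, assuming the individual is still
    # circulating on day i only if no earlier day led to hospitalization.
    return sum((1 - p_critical) ** i for i in range(days))

def r_e(k, x=0.0, p=0.02, days=10, p_critical=0.01):
    # Effective reproductive number for k daily contacts and immune fraction x
    # (the factor 1 - x is the usual well-mixed assumption).
    return k * p * effective_infectious_days(days, p_critical) * (1 - x)

def k_star(p=0.02, days=10, p_critical=0.01):
    # Number of daily contacts at which R_e = 1 when nobody is immune.
    return 1 / (p * effective_infectious_days(days, p_critical))

def immune_fraction_for_control(k, **kwargs):
    # Smallest immune fraction x for which R_e <= 1 at contact rate k.
    return max(0.0, 1 - 1 / r_e(k, 0.0, **kwargs))

print(round(r_e(15), 2), round(k_star(), 2))
print(r_e(1.25) < 1 < r_e(6))   # severe vs. moderate lockdown at x = 0
```

With these assumptions a severe lockdown (k = 1.25) is well below the threshold, whereas a moderate lockdown (k = 6) sits slightly above it and only brings R_e below 1 once a modest fraction of the population is immune.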
Peak size C_max. We observe that (see Fig. 3a):
τ: The low-trigger policies (blue curve) are consistently better than the high-trigger policies (green curve).
k: For both trigger values, C_max is roughly constant as long as k < k⋆ but then it increases rapidly when k > k⋆.
d: The left and the right panels are comparable.
Hence, the important insight is to have the severity below a threshold (k < k⋆) and to have the trigger τ low. The effects of the patience d and the severity k (given that k < k⋆) are marginal.
The intuitive explanation is that when k > k⋆, the lockdown is so weak that the disease continues to spread even while the lockdown is in place. Hence, having k < k⋆ is key. On the other hand, given that k < k⋆, the actual value of k is not that important: when a lockdown is put in place, the peak size (and the moment when the peak occurs) have already been essentially determined, since most of the critical cases at the peak are due to individuals who had already been infected when the lockdown was put in place. Similarly, the patience d is not that important as it affects the number of peaks rather than their size. In contrast, the trigger value τ is key: the nature of the exponential growth and the inherent delay due to the pre-infectious period and the non-critical infection translate the difference in trigger value τ to a difference in peak size C_max.
Overflow probability p_fail. We observe that (see Fig. 3b):
τ: The low-trigger policies (blue curve) are consistently better than the high-trigger policies (green curve).
k: For low-trigger policies, p_fail exhibits a threshold behavior with respect to k. For high-trigger policies, p_fail increases when k < k⋆ and increases rapidly when k > k⋆.
d: For high-trigger policies (green curve), increasing the patience d decreases p_fail (when k < k⋆).
Hence, the important insight is, again, to have the severity below a threshold (k below k⋆ or just very slightly above) and to have the trigger τ low. When the trigger τ is high, both decreasing k and increasing d help, but even the combined effect is negligible compared to the effect of having τ low.
The intuitive explanation is that, in large populations, most stochastic trajectories are qualitatively similar. Hence, even though the peak size is a random variable, it is narrowly concentrated around its average value. Thus, whenever the average peak size C_max slightly exceeds the available bed capacity c, the overflow probability is almost 1. And, vice versa, whenever the average peak size C_max is slightly lower than the available bed capacity c, the overflow probability is almost 0. In other words, increasing the bed capacity c typically does not decrease p_fail, unless we cross exactly from the regime c < C_max to the regime c > C_max, in which case it helps dramatically. In a sense, the overflow probability exaggerates the difference between C_max and c.
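This threshold behavior can be illustrated numerically. The sketch below assumes, purely for illustration, that the peak size M is approximately normally distributed around C_max with a small standard deviation `sigma`; neither the normal shape nor the value of `sigma` comes from the source, and the function name is hypothetical.

```python
import math

def overflow_probability(c_max, capacity, sigma):
    # Pr[M > capacity] for M ~ Normal(c_max, sigma), via the complementary
    # error function: 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
    z = (capacity - c_max) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

# With c = 56 beds and a spread of 2 cases, moving the expected peak by a few
# cases flips the overflow probability from nearly 0 to nearly 1.
for c_max in (50, 54, 56, 58, 62):
    print(c_max, round(overflow_probability(c_max, 56, 2.0), 3))
```

The narrower the distribution of M, the sharper the jump, which is why p_fail amplifies small differences between C_max and c.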
Figure 3. In each panel, we vary the number k of daily contacts (x-axis) and consider the performance (cost) of the low-trigger policies (τ_low = 3, blue) and of the high-trigger policies (τ_high = 12, green), when the patience parameter is low (d = 7 days, left column) and high (d = 70 days, right column). The dotted red line shows the number k⋆ of daily contacts that corresponds to the effective reproductive number R_e equal to 1 (when no individuals have yet recovered). Generally speaking, it is beneficial to have the trigger value τ low (blue curves are below green ones), to impose a severe rather than a moderate lockdown (all curves are increasing functions of k for k ≤ k⋆), and to be patient (the curves in the right panels are lower). For C_max and p_fail, the key is to have the trigger value τ low. For C_all and D, the key is to have the patience d high.
Total critical cases C_all. We observe that (see Fig. 3c):
τ: When impatient (left panel), the low-trigger policies (blue curve) are consistently better than the high-trigger policies (green curve).
Hence, the important insight is to have the patience d high. In order to achieve comparable results with low patience, one has to have a low trigger τ and a very severe lockdown.
The intuitive explanation is that high patience d is key because it decreases the chance of a premature reopening and thereby reduces the number of times a lockdown has to be put in place. With a sufficiently high patience, the disease gets eradicated within the community upon completing the first peak. With low patience d (and, thus, many peaks), the only way to avoid many critical cases is to make sure each peak is small. This requires a low trigger value τ and a severe lockdown (low k) every time the trigger value is reached.
As a final remark, note that for impatient high-trigger policies, the lockdowns have to be extremely severe to help even a little: this is the only way to have at least some hope that the disease is eradicated within the short time before the lockdown is lifted.
Total lockdown duration D. Since a severe lockdown is very different from a moderate lockdown, here we focus our comparison on only those lockdowns that have the same severity. We observe that (see Fig. 3d):
τ: When impatient (left panel), low-trigger policies (blue curve) are better assuming k is small, otherwise high-trigger policies (green curve) are better. When patient (right panel), low- and high-trigger policies are comparable.
d: Increasing the patience d decreases D.
Hence, for severe lockdowns (k < k⋆), the important insight is, again, to have the patience d high. In those cases, the effect of the trigger value is marginal.
The intuitive explanation for severe lockdowns of fixed severity k < k⋆ is that the key to minimizing the total lockdown duration is to minimize the probability q of a subsequent outbreak (that would occur if the lockdown were lifted too early). Since in a severe lockdown (k < k⋆) the numbers of infected and exposed individuals decay exponentially, there are two ways to decrease q (and thereby D) by a constant factor: either to increase the patience by a constant number of days, or to decrease the trigger value τ by a constant factor. The former is less costly, so in this case having high patience is the single most important aspect.
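The trade-off between patience and trigger value can be made concrete with a rough calculation. The sketch below assumes that, during a severe lockdown, the number of infected individuals shrinks by a factor R_e per generation interval, and that for small q the outbreak probability is roughly proportional to the number of infections remaining at reopening. The generation interval of 7 days and the value R_e = 0.25 are hypothetical placeholders, and the function names are illustrative.

```python
import math

def decay_rate(r_e, generation_days):
    # Exponential decay rate (per day) of the infected count when each
    # generation of length generation_days is r_e < 1 times the previous one.
    return -math.log(r_e) / generation_days

def extra_patience_days(factor, r_e, generation_days):
    # Additional lockdown days that shrink the remaining infections (and,
    # roughly, q) by the same factor as dividing the trigger tau by `factor`.
    return math.log(factor) / decay_rate(r_e, generation_days)

# Halving the trigger threshold is matched by a fixed number of extra days.
print(extra_patience_days(2, r_e=0.25, generation_days=7))
```

Under these assumptions each further halving of τ costs the same fixed number of extra lockdown days, so patience grows only logarithmically with the reduction in q, consistent with the argument above.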
Summary. Here we summarize three findings from the above paragraphs. First, any successful policy must impose lockdowns that restrict the number k of daily contacts under the threshold k⋆. Second, in terms of minimizing either the peak size C_max or the overflow probability p_fail, it is crucial to employ a policy with a low trigger threshold τ. Third, in terms of minimizing either the total number C_all of critical cases or the lockdown duration D, it is crucial to employ policies with high patience d. The precise optimal value of patience depends on the severity of the lockdown and on the trigger threshold τ.

Two countries.
Some interaction among communities is inevitable. While the inter-community interaction can be limited by closing borders between countries and imposing quarantine upon entry, it cannot be completely disregarded. We study how the policies perform in an environment where different communities might employ different policies.
To model this, we consider two communities, Country 1 and Country 2, that experience the onset of the disease at the same time. Occasionally, individuals from different countries meet. Namely, we assume that for each individual, a small portion q ∈ (0, 1) of their interactions are with individuals in the other country.
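One simple way to realize this coupling in a simulation is to redirect each contact to the other country with probability q. The sketch below is an assumption about how the mixing could be implemented, not a description of the authors' code; the function name `draw_contact` and the representation of a person as a (country, index) pair are hypothetical.

```python
import random

def draw_contact(home, q, sizes, rng):
    """Pick a random contact for an individual living in country `home`.

    home:  0 or 1, the individual's own country
    q:     fraction of interactions that cross the border
    sizes: population sizes of the two countries, e.g. (20_000, 20_000)
    Returns (country, index) of the person met.
    """
    country = 1 - home if rng.random() < q else home
    return country, rng.randrange(sizes[country])

rng = random.Random(0)
meetings = [draw_contact(0, 0.1, (20_000, 20_000), rng) for _ in range(10_000)]
abroad = sum(1 for country, _ in meetings if country == 1)
print(abroad / len(meetings))   # close to q = 0.1
```

Each country would still follow its own policy, so the number of contacts drawn per day can differ between the two sides.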
We consider two very different policies and study how their performance depends on the policy employed by the neighboring country and on the interaction rate q between the two countries (see Fig. 4).
Specifically, we consider a moderate low-trigger policy P_ML = P(τ = 3, k = 6, d = 54) and a severe high-trigger policy P_SH = P(τ = 12, k = 1.25, d = 27). The parameters τ and k are chosen such that both policies have the same probability, 10%, of exceeding the US hospital bed capacity within their first peak. The parameter d is chosen such that, on average, 90% of the critical cases occur within the first peak. In other words, the policy reopens when it expects that 90% of all the critical cases have already been hospitalized.
When both countries employ the same policy, the performance is essentially the same as for a larger country employing that policy. Also, in the limit q → 0 the two countries do not interact and the performance of a country is independent of the policy of the neighbor (this remains true for q < 10^−5). However, when q is non-negligible and one country uses P_ML (the "P_ML-country") while the other uses P_SH (the "P_SH-country"), the two policies clash. Specifically, we make the following observations about the overflow probability p_fail and the average peak size C_max (see Fig. 4):
1. For q small, the green curve increases as q increases: note that the infectious subpopulation of the neighboring P_ML-country is non-negligible for an extended period of time (at least throughout the first broad peak). Thus, as q increases, the individuals in the P_SH-country get repeatedly infected due to interactions with the P_ML-country. Most such new infections cause a new spike for the P_SH-country. Each such spike might exceed the previously largest peak and/or the available capacity. (Moreover, each such spike leads to new critical cases and has to be contained by another lockdown phase, so it is costly in terms of C_all and D too.) This effect is visible for interaction rates as small as q = 10^−4.
2. For q large, the blue and green curves decrease as q increases: two countries employing different policies typically reach their peaks at different points in time. Thus, when one country is peaking, the other country likely has fewer infectious individuals, and interaction with that country helps to reduce the size of the peak in the first country. This, in turn, decreases the maximum peak load C_max and the overflow probability p_fail. This effect is visible roughly for q > 10^−3 (blue curve) and q > 10^−1 (green curve), respectively.
3. The yellow and the red curves are roughly constant, except that the yellow curve goes down in terms of the overflow probability when q is large: when two neighboring countries employ the same policy, the extra occasional mixing due to interacting individuals makes both countries behave in a slightly more average way. This does not change the expected size of the peak, and hence C_max is constant, regardless of the interaction rate q. However, this process of "averaging out" does make extreme behavior, such as overflowing of the available hospital capacity, somewhat less likely. This effect of diminished overflow probability is observed for two P_ML countries when q > 10^−2.
Figure 4 (panels c-f). The performance of a P_ML policy against a P_SH policy (blue), P_SH vs. P_ML (green), P_ML vs. P_ML (yellow) and P_SH vs. P_SH (red), averaged over 10^4 runs. We vary the interaction rate q on a log scale and measure: (c) the overflow probability p_fail (95% confidence intervals are shaded); (d) the expected peak size C_max; (e) the total number C_all of critical cases; and (f) the lockdown duration D. A country employing P_SH performs very well when its neighbor employs P_SH (red) but poorly when the neighbor employs P_ML (green). A country employing P_ML does comparably well, regardless of whether the neighbor employs P_ML (yellow) or P_SH (blue).
This leads to an interesting phenomenon resembling a social dilemma 44,45. Consider two countries with a nonnegligible interaction rate q > 10⁻³. First, if either of the two countries primarily cares about keeping the overflow probability low, then that country would employ the policy P ML rather than the policy P SH, no matter which policy the other country is using (indeed, blue is below red and yellow is below green). Second, given that one country uses the policy P ML, the other country will use the policy P ML too (yellow is below green). In terms of the overflow probability alone, this is an acceptable outcome (red and yellow are comparable), but in terms of the total number of critical cases C all, it is undesirable: by employing the policy P ML rather than P SH, both countries increase the total number of their critical cases (and, possibly, deaths) by an order of magnitude.
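The structure of this dilemma can be made explicit with a small check for a dominant policy. The following is a minimal sketch, not the analysis used in the paper: the function name `dominant_policy` and all numerical values are hypothetical placeholders, chosen only to reproduce the qualitative ordering described above (blue below red, yellow below green, and an order-of-magnitude gap in C all). They are not read off Fig. 4.

```python
from typing import Dict, Optional, Sequence, Tuple

# cost[(mine, neighbor)] is the cost borne by "my" country (lower is better).
CostTable = Dict[Tuple[str, str], float]


def dominant_policy(cost: CostTable, policies: Sequence[str]) -> Optional[str]:
    """Return the policy that is strictly better than every alternative
    against every neighbor policy, or None if no such policy exists."""
    for mine in policies:
        alternatives = [p for p in policies if p != mine]
        if all(
            cost[(mine, nb)] < cost[(alt, nb)]
            for nb in policies
            for alt in alternatives
        ):
            return mine
    return None


POLICIES = ("ML", "SH")

# HYPOTHETICAL overflow probabilities (illustration only, not data from the paper).
overflow: CostTable = {
    ("ML", "SH"): 0.05,  # blue
    ("SH", "SH"): 0.08,  # red
    ("ML", "ML"): 0.06,  # yellow
    ("SH", "ML"): 0.60,  # green
}

# HYPOTHETICAL total critical cases when both countries use the same policy,
# in arbitrary units; only the order-of-magnitude gap matters here.
c_all = {("ML", "ML"): 10.0, ("SH", "SH"): 1.0}

best = dominant_policy(overflow, POLICIES)

# A dilemma arises when the dominant choice, adopted by both countries,
# is worse in total critical cases than some other symmetric outcome.
dilemma = best is not None and c_all[(best, best)] > min(
    c_all[(p, p)] for p in POLICIES
)
```

With these placeholder values, P ML is dominant with respect to the overflow probability, yet the outcome in which both countries use P ML has more critical cases than the outcome in which both use P SH, which is the pattern the text describes.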

Discussion
Motivated by the COVID-19 pandemic, we studied a simple stochastic model of disease progression in a population of interacting individuals. We focused on a 3-parameter family of policies that can be used to mitigate the disease spread and evaluated the performance of those policies with respect to several measures, such as the number of critical cases at its maximum or the probability that this number exceeds the available hospital bed capacity. The three parameters describing the policies correspond to the agility when closing down, the severity of the lockdown, and the patience when reopening. We identified which parameters are important in which regime and explained why some policies perform better than others. We note that understanding the dynamics of periodic lockdowns is important in case the virus escapes vaccination or for future epidemics.
We highlight two different types of realistic policies, called moderate low-trigger (P ML) and severe high-trigger (P SH). With both policies, the probability p fail of ever exceeding the available hospital bed capacity is kept below a specified threshold (here arbitrarily set to 10%), but the two policies are very different: the P SH policy is characterized by imposing a harsh, short lockdown ("severe") at the last moment possible ("high-trigger"), whereas the P ML policy is proactive and imposes gentle, longer lockdowns ("moderate") at the first signal of an approaching outbreak ("low-trigger").
Due to the above differences, both policies have their advantages and disadvantages. The P SH policy minimizes the total number of critically ill cases and the total amount of time spent in lockdown. However, the lockdowns are severe and society is more susceptible to any subsequent outbreaks. To avoid such recurring outbreaks, the authorities must be patient when reopening and any neighboring countries must coordinate when releasing their measures. The P ML policy maintains its moderate lockdown for substantially longer, but its performance is substantially more robust with respect to how soon society reopens and with respect to what policies are employed by the neighboring countries. Moreover, upon completing the first lockdown phase, all subsequent lockdowns (if any) are shorter and involve substantially fewer critical cases than the first phase. Thus, the P ML policy can be seen as minimizing the long-term risks under the pessimistic scenario that an ultimate long-term solution (such as a majority of the population being vaccinated) is not achieved anytime soon. On the other hand, the P SH policy is optimistic about the future and optimizes the short-term performance.
Our setup is intentionally simplified in many regards such as the disease progression within an individual, the disease spread throughout the population, and the class of policies we consider for closing down and reopening. We highlight four possible extensions worth pursuing in subsequent work. First, regarding the disease progression within an individual, one can distinguish more types of individuals (e.g. those who require only hospital beds and those who moreover require ventilation) and lift the assumption that the individuals acquire lifelong immunity. We note that our model already implicitly includes asymptomatic carriers, since those are equivalent to individuals who developed a mild condition and then recovered. Second, regarding the disease spread in a population, one can consider a population structure, e.g. described by a graph whose edge weights determine the daily pairwise transmission probabilities. This would allow one to investigate the effects of localized interventions such as contact tracing. Third, one can consider more complicated policies, e.g. policies that allow for a gradual reopening or policies that, when deciding whether and how much to reopen, take into account additional information, such as the situation in the neighboring countries and/or the outcomes of testing done earlier. Fourth, on top of considering the health viewpoint, one can incorporate the economic viewpoint by introducing an appropriate notion of economic cost of a lockdown of varying degree of severity.
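To illustrate the second extension above, the following is a minimal sketch of one day of transmission on a weighted contact graph. It is an assumption-laden illustration and not part of the model studied here: the function `daily_transmission_step`, the edge-dictionary representation, and the example weights are all hypothetical. Each undirected edge carries a daily transmission probability, and every infectious-susceptible pair joined by an edge transmits independently with that probability.

```python
import random
from typing import Dict, Set, Tuple

# Hypothetical representation: undirected edge (u, v) -> daily transmission probability.
EdgeWeights = Dict[Tuple[int, int], float]


def daily_transmission_step(
    weights: EdgeWeights,
    infectious: Set[int],
    susceptible: Set[int],
    rng: random.Random,
) -> Set[int]:
    """Return the set of susceptible individuals infected during one day."""
    newly_infected: Set[int] = set()
    for (u, v), p in weights.items():
        # Treat each edge as undirected: transmission may go either way.
        for source, target in ((u, v), (v, u)):
            if source in infectious and target in susceptible:
                if rng.random() < p:
                    newly_infected.add(target)
    return newly_infected


# Toy example with extreme weights so that the outcome is deterministic.
weights: EdgeWeights = {(0, 1): 1.0, (1, 2): 0.0, (2, 3): 1.0}
new_cases = daily_transmission_step(
    weights, infectious={0}, susceptible={1, 2, 3}, rng=random.Random(0)
)
```

In such a representation, a localized intervention such as contact tracing could be modeled by lowering the weights of the edges incident to a traced individual, leaving the rest of the graph unchanged.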